A Probabilistic Approach to Self-Supervised Learning using Cyclical Stochastic Gradient MCMC
In this paper we present a practical Bayesian self-supervised learning method
with Cyclical Stochastic Gradient Hamiltonian Monte Carlo (cSGHMC). Within this
framework, we place a prior over the parameters of a self-supervised learning
model and use cSGHMC to approximate the high dimensional and multimodal
posterior distribution over the embeddings. By exploring an expressive
posterior over the embeddings, Bayesian self-supervised learning produces
interpretable and diverse representations. Marginalizing over these
representations yields a significant gain in performance, calibration and
out-of-distribution detection on a variety of downstream classification tasks.
We provide experimental results on multiple classification tasks on four
challenging datasets. Moreover, we demonstrate the effectiveness of the
proposed method in out-of-distribution detection using the SVHN and CIFAR-10
datasets.
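
The abstract above does not spell out how the "cyclical" part of cSGHMC works. A minimal sketch of one common choice in the cyclical stochastic gradient MCMC literature, a cosine step-size schedule that restarts at the beginning of every cycle, is given below. Whether the paper uses exactly this schedule is an assumption, and the function and parameter names (cyclical_step_size, total_iters, num_cycles, base_step) are hypothetical.

```python
import math


def cyclical_step_size(k, total_iters, num_cycles, base_step):
    """Cosine cyclical step size at 1-based iteration k (illustrative only).

    Assumed schedule: the step size starts each cycle at base_step and
    decays towards zero along half a cosine wave, then restarts.
    """
    cycle_len = math.ceil(total_iters / num_cycles)
    # Position inside the current cycle, in [0, 1).
    pos = ((k - 1) % cycle_len) / cycle_len
    return 0.5 * base_step * (math.cos(math.pi * pos) + 1.0)


if __name__ == "__main__":
    # Print the first and last step size of each of 4 cycles over 100 iterations.
    for k in (1, 25, 26, 50, 51, 75, 76, 100):
        print(k, cyclical_step_size(k, 100, 4, 0.1))
```

In schedules of this kind, the large steps early in a cycle are typically used to move between modes of the posterior, and samples are collected late in the cycle, when the step size is small.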
Training Normalizing Flows from Dependent Data
Normalizing flows are powerful non-parametric statistical models that
function as a hybrid between density estimators and generative models. Current
learning algorithms for normalizing flows assume that data points are sampled
independently, an assumption that is frequently violated in practice, which may
lead to erroneous density estimation and data generation. We propose a
likelihood objective of normalizing flows incorporating dependencies between
the data points, for which we derive a flexible and efficient learning
algorithm suitable for different dependency structures. We show that respecting
dependencies between observations can improve empirical results on both
synthetic and real-world data, and leads to higher statistical power in a
downstream application to genome-wide association studies.
MixerFlow for Image Modelling
Normalising flows are statistical models that transform a complex density
into a simpler density through the use of bijective transformations, enabling
both density estimation and data generation from a single model. In the context
of image modelling, the predominant choice has been the Glow-based
architecture, whereas alternative architectures remain largely unexplored in
the research community. In this work, we propose a novel architecture called
MixerFlow, based on the MLP-Mixer architecture, further unifying the generative
and discriminative modelling architectures. MixerFlow offers an effective
mechanism for weight sharing for flow-based models. Our results demonstrate
better density estimation on image datasets under a fixed computational budget
and good scaling as the image resolution increases, making MixerFlow a powerful
yet simple alternative to the Glow-based architectures. We also show that
MixerFlow provides more informative embeddings than Glow-based architectures.
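
The statement that a single bijective model supports both density estimation and data generation follows from the change-of-variables formula, log p_x(x) = log p_z(f(x)) + log |det J_f(x)|. The sketch below illustrates it with a one-dimensional affine bijection and a standard normal base density. It is a toy illustration, not the MixerFlow architecture, and all function names are hypothetical.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of the standard normal base distribution."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def affine_flow_logpdf(x, shift, log_scale):
    """Density estimation: map x to the base space and add the log-Jacobian.

    Forward map f(x) = (x - shift) * exp(-log_scale), so
    log |df/dx| = -log_scale.
    """
    z = (x - shift) * math.exp(-log_scale)
    log_det = -log_scale
    return standard_normal_logpdf(z) + log_det


def affine_flow_sample(z, shift, log_scale):
    """Data generation: apply the inverse map to a base sample z."""
    return z * math.exp(log_scale) + shift


if __name__ == "__main__":
    shift, log_scale = 3.0, math.log(2.0)
    x = affine_flow_sample(0.7, shift, log_scale)
    print(x, affine_flow_logpdf(x, shift, log_scale))
```

With these parameters the flow represents a normal distribution with mean 3 and standard deviation 2. Practical flows stack many such invertible layers and sum their log-Jacobian terms.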
Sparse Probit Linear Mixed Model
Linear Mixed Models (LMMs) are important tools in statistical genetics. When
used for feature selection, they make it possible to find a sparse set of genetic traits
that best predict a continuous phenotype of interest, while simultaneously
correcting for various confounding factors such as age, ethnicity and
population structure. Formulated as models for linear regression, LMMs have
been restricted to continuous phenotypes. We introduce the Sparse Probit Linear
Mixed Model (Probit-LMM), where we generalize the LMM modeling paradigm to
binary phenotypes. As a technical challenge, the model no longer possesses a
closed-form likelihood function. In this paper, we present a scalable
approximate inference algorithm that lets us fit the model to high-dimensional
data sets. We show on three real-world examples from different domains that in
the setup of binary labels, our algorithm leads to better prediction accuracies
and also selects features which show less correlation with the confounding
factors.
Comment: Published version, 21 pages, 6 figures
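
To illustrate what moving from continuous to binary phenotypes involves, the sketch below evaluates a probit log-likelihood, in which the probability of label 1 is the standard normal CDF of a latent linear predictor. It assumes the latent values are given and ignores the sparsity prior and random effects, so it is not the paper's model or inference algorithm. The function names are hypothetical.

```python
import math


def std_normal_cdf(t):
    """Standard normal CDF, written with erfc for better tail accuracy."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def probit_log_likelihood(labels, latent):
    """Sum of log P(y_i | f_i) under a probit link.

    labels: iterable of 0/1 phenotypes.
    latent: iterable of latent predictor values f_i (assumed given here).
    Uses P(y=1 | f) = Phi(f) and P(y=0 | f) = Phi(-f).
    Extreme latent values can underflow to log(0); this sketch does not guard against that.
    """
    total = 0.0
    for y, f in zip(labels, latent):
        sign = 1.0 if y == 1 else -1.0
        total += math.log(std_normal_cdf(sign * f))
    return total


if __name__ == "__main__":
    print(probit_log_likelihood([1, 0, 1], [0.8, -1.2, 0.1]))
```

Once the latent predictor contains Gaussian random effects that must be integrated out, this likelihood has no closed form, which is the technical challenge the abstract refers to.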
Kernelized Normalizing Flows
Normalising Flows are generative models characterised by their invertible
architecture. However, the requirement of invertibility imposes constraints on
their expressiveness, necessitating a large number of parameters and innovative
architectural designs to achieve satisfactory outcomes. Whilst flow-based
models predominantly rely on neural-network-based transformations for
expressive designs, alternative transformation methods have received limited
attention. In this work, we present Ferumal flow, a novel kernelised
normalising flow paradigm that integrates kernels into the framework. Our
results demonstrate that a kernelised flow can yield competitive or superior
results compared to neural network-based flows whilst maintaining parameter
efficiency. Kernelised flows excel especially in the low-data regime, enabling
flexible non-parametric density estimation in applications with sparse data
availability.